Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Fix LOOKUP JOIN with limit #120411

Merged
merged 32 commits into from
Jan 29, 2025
Merged

Conversation

alex-spies
Copy link
Contributor

@alex-spies alex-spies commented Jan 17, 2025

Relates #118781.

This is like #115624, in spirit, but for LOOKUP JOIN.

For queries like

... | LOOKUP JOIN lookup_index ON key | LIMIT 10

the limit cannot be simply pushed past the join - but it can be duplicated past the join. The same currently happens with MV_EXPAND.

This PR solves this by leaving an explicit Limit plan node downstream from the Join (in addition to pushing down the limit), but marks it in a way that prevents being duplicated multiple times (which would cause infinite loops).

In contrast to the approach used for MV_EXPAND, the downstream limit is not internalized into the Join node, but instead we mark it with a boolean attribute that marks them as un-pushable past MvExpand and Joins. The fix for MV_EXPAND is updated and aligned with the approach for LOOKUP JOIN.

This approach was discussed with @luigidellaquila. I believe it is preferable rather than having a limit attribute on MvExpand and Join (+ subclasses), because it means that the logic of pushing down limits remains local to the Limit class, and the mapper remains oblivious to all this, too. Also, any optimization rules (current and future ones) that do anything with limits continue to work without additional special casing for MvExpand and Join. (PushDownAndCombineLimits is already affected by this in an edge case.)

@alex-spies alex-spies added auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.18.0 >non-issue labels Jan 21, 2025
Copy link
Contributor Author

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I plan to add more optimizer tests, but the production code changes are complete, so I'm undrafting already.

@alex-spies alex-spies marked this pull request as ready for review January 21, 2025 16:40
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 21, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@astefan
Copy link
Contributor

astefan commented Jan 22, 2025

I've started playing with the code. Found something, putting it here, sorry for not digging through it due to time constraints. I've added some more rows to employees to make it more challenging (I hoped). Also, apologies for the messy test data:

{"index":{"_index":"employees","_id":1}}
{"text2":"red fox jumps over black dog", "emp_no":555, "languages": 1}
{"index":{"_index":"employees","_id":2}}
{"text2":"red fox", "emp_no":666, "languages": 2}
{"index":{"_index":"employees","_id":3}}
{"text2":"black dog", "emp_no":777, "languages": 3}
{"index":{"_index":"employees","_id":4}}
{"text2":null, "emp_no":0, "languages": null}
{"index":{"_index":"employees","_id":5}}
{"text2":"foo", "emp_no":[50005,50006], "languages": [2,3,120]}
{"index":{"_index":"employees","_id":6}}
{"text2":"bar", "emp_no":[50007,50008], "languages": [1,2,3]}
{"index":{"_index":"employees","_id":7}}
{"text2":null, "emp_no":[50009,50010], "languages": [3,5]}

from employees | rename languages as language_code | limit 15 | sort emp_no desc | lookup join languages_lookup on language_code | keep emp_no, language* | sort emp_no desc | limit 15

This one results in

status: 500 org.elasticsearch.xpack.esql.EsqlIllegalArgumentException: unknown physical plan node [OrderExec]
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:244)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planProject(LocalExecutionPlanner.java:618)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:209)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planLookupJoin(LocalExecutionPlanner.java:554)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:235)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planTopN(LocalExecutionPlanner.java:351)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:201)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planProject(LocalExecutionPlanner.java:618)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:209)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.planOutput(LocalExecutionPlanner.java:282)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:239)
        at org.elasticsearch.xpack.esql.planner.LocalExecutionPlanner.plan(LocalExecutionPlanner.java:180)
        at org.elasticsearch.xpack.esql.plugin.ComputeService.runCompute(ComputeService.java:316)

@alex-spies
Copy link
Contributor Author

Nice find @astefan . I can reproduce this, also with mv_expand.

However, the problem is already present on main (also for mv_expand). It is unrelated to this PR per se, and has to do with the fact that if there's more than one SORT, there's no LIMIT to be pushed into the upstream SORT, and thus we do not create a TopN out of it. For most commands, this is solved by pushing them down past the SORT, so that multiple SORTs end up next to each other and can be pruned.

I'll create a separate issue for this.

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex, LGTM!

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Left two highly optional notes.

Copy link
Contributor Author

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a bunch of test cases in LogicalPlanOptimizer; more specifically, versions of all mv_expand tests that applied but with lookup join instead - barring those that would fail due to unbounded sorts (pointed out by @astefan here).

Comment on lines +2719 to +2723
ROW language_code = 1
| MV_EXPAND language_code
| LIMIT 1
| SORT language_code
| LIMIT 3
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case used to not correctly take into account the LIMIT 1 at the beginning of the plan because it was absorbed into the MV_EXPAND. Now that the LIMIT 1 is duplicated rather than pushed down + absorbed, the downstream LIMIT 3 realizes that its actually a LIMIT 1 due to the upstream LIMIT 1.

@alex-spies
Copy link
Contributor Author

Thanks a lot for the reviews everyone!

@alex-spies alex-spies merged commit 843f1b8 into elastic:main Jan 29, 2025
16 checks passed
@alex-spies alex-spies deleted the fix-lookup-with-limit branch January 29, 2025 09:57
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.x

alex-spies added a commit to alex-spies/elasticsearch that referenced this pull request Jan 29, 2025
For queries like 
... | LOOKUP JOIN lookup_index ON key | LIMIT 10
the limit cannot be simply pushed past the join - but it can be duplicated past the join.

In such cases, leave an explicit Limit plan node downstream from the Join (in addition to pushing down the limit), but mark it in a way that prevents being duplicated multiple times (which would cause infinite loops).

Align the logic for MV_EXPAND, which used to, instead, internalize a limit into the MvExpand node.
elasticsearchmachine pushed a commit that referenced this pull request Jan 29, 2025
* ESQL: Fix LOOKUP JOIN with limit (#120411)

For queries like 
... | LOOKUP JOIN lookup_index ON key | LIMIT 10
the limit cannot be simply pushed past the join - but it can be duplicated past the join.

In such cases, leave an explicit Limit plan node downstream from the Join (in addition to pushing down the limit), but mark it in a way that prevents being duplicated multiple times (which would cause infinite loops).

Align the logic for MV_EXPAND, which used to, instead, internalize a limit into the MvExpand node.

* Fix compilation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants